GeneratePress Notion Migration Log

This report documents the migration of the GeneratePress Notion export into Docusaurus, including preparation, issues encountered, solutions applied, scripts used, validation, and git delivery.

1. Objective

  • Migrate GeneratePress docs from Notion export into Docusaurus at docs/wordpress/generatepress.
  • Use the migration standards from docs/knowledge/docusaurus/11. Migration.
  • Build a full 11-module scaffold (user-selected option), including placeholders for missing lesson content.
  • Keep output build-safe for MDX.

2. Source and Destination

| Item | Path |
| --- | --- |
| Source parent folder | /home/rezriz/notion-export -to-docusaurus/generatepress |
| Source zip | /home/rezriz/notion-export -to-docusaurus/generatepress/generatepress-notion.zip |
| Nested source zip | /home/rezriz/notion-export -to-docusaurus/generatepress/ExportBlock-64e47d76-77f8-4c0b-b3ba-09a00a188625-Part-1.zip |
| Source root curriculum page | /home/rezriz/notion-export -to-docusaurus/generatepress/Generatepress 28c05f1abc3d819d8669e7afbe72b428.mdx |
| Source curriculum outline page | /home/rezriz/notion-export -to-docusaurus/generatepress/Generatepress/Curriculum outline 28f05f1abc3d8024b13efe41a69c93f6.mdx |
| Docusaurus target root | /opt/docker-data/apps/docusaurus/site/docs/wordpress/generatepress |
| Migration guidance folder used | /opt/docker-data/apps/docusaurus/site/docs/knowledge/docusaurus/11. Migration |

3. Preparation Performed

3.1 Locate and extract source archives

# Refresh the locate database (requires root), then find the archive
sudo updatedb
plocate generatepress-notion.zip

unzip -o "/home/rezriz/notion-export -to-docusaurus/generatepress/generatepress-notion.zip" \
-d "/home/rezriz/notion-export -to-docusaurus/generatepress"

unzip -o "/home/rezriz/notion-export -to-docusaurus/generatepress/ExportBlock-64e47d76-77f8-4c0b-b3ba-09a00a188625-Part-1.zip" \
-d "/home/rezriz/notion-export -to-docusaurus/generatepress"
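The two unzip calls above can be generalized so that nested archives are never missed. The following is a minimal sketch, not the script used in this migration; the function name `extract_nested` is hypothetical, and it assumes inner archives can be recognized by a `.zip` extension.

```python
import zipfile
from pathlib import Path


def extract_nested(archive: Path, dest: Path) -> list[Path]:
    """Extract `archive` into `dest`, then extract any .zip files it contained.

    Hypothetical helper: returns the archives that were extracted.
    """
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    pending = [archive]
    seen = set()
    while pending:
        current = pending.pop()
        if current in seen:
            continue
        seen.add(current)
        with zipfile.ZipFile(current) as zf:
            names = zf.namelist()
            zf.extractall(dest)
        extracted.append(current)
        # Queue inner archives that this extraction just produced.
        pending.extend(dest / n for n in names if n.lower().endswith(".zip"))
    return extracted
```

Calling `extract_nested(Path("generatepress-notion.zip"), Path("generatepress"))` would unpack the outer export and then any `ExportBlock-*.zip` found inside it, which is the manual two-step sequence shown above.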

3.2 Normalize source extensions

All source Markdown files were renamed from .md to .mdx in place before migration. Only the extension changed at this step; file contents were left untouched.

python3 - <<'PY'
from pathlib import Path

root = Path('/home/rezriz/notion-export -to-docusaurus/generatepress')
# Materialize the matches first so renaming does not disturb the directory walk.
for p in list(root.rglob('*.md')):
    p.rename(p.with_suffix('.mdx'))
PY

3.3 Review migration guidance

The following reference docs were read before implementation:

  • knowledge/docusaurus/11. Migration/notion-to-docusaurus.mdx
  • knowledge/docusaurus/11. Migration/rsync-notion-migration-log.mdx
  • knowledge/docusaurus/11. Migration/mysql-migration-and-quality-log.mdx
  • knowledge/docusaurus/11. Migration/wp-performance-notion-migration-log.mdx

4. Source Assessment

Initial source audit results:

| Metric | Value |
| --- | --- |
| Total source .mdx files | 65 |
| Linked files from curriculum map | 64 |
| Unique linked source files | 64 |
| Real-content source pages detected | 17 |
| Empty-wrapper source pages detected | 47 |

Notes:

  • Most Notion pages were wrapper-only (<aside>, Home block, repeated curriculum nav).
  • Real content was concentrated in a subset of Module 1, Module 2, and Module 7 pages.
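A page audit of this kind can be approximated by looking at what remains after the Notion wrapper blocks. The sketch below is illustrative only: the function name `has_real_content` and the 40-character threshold are assumptions, and it treats everything after the second closing `</aside>` as the candidate body.

```python
import re

MIN_CHARS = 40  # hypothetical threshold, not taken from the original script


def has_real_content(text: str, min_chars: int = MIN_CHARS) -> bool:
    """Return True if meaningful text follows the second <aside> block."""
    closes = [m.end() for m in re.finditer(r"</aside>", text)]
    # Assumption: with fewer than two wrappers, judge the whole page.
    tail = text[closes[1]:] if len(closes) >= 2 else text
    # Ignore Markdown links (repeated curriculum nav) and '---' separators.
    tail = re.sub(r"\[[^\]]*\]\([^)]*\)", "", tail)
    tail = re.sub(r"^\s*-{3,}\s*$", "", tail, flags=re.M)
    return len("".join(tail.split())) >= min_chars
```

Counting `has_real_content(p.read_text())` over all source pages would yield a real-content versus empty-wrapper split like the one in the table above, although the exact counts depend on the threshold chosen.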

5. Migration Strategy Executed

5.1 User-selected mode

  • Chosen mode: full 11-module scaffold.
  • This preserved every module and lesson from the outline, even when source body content was missing.

5.2 Curriculum-first mapping

  • Parsed module and lesson order from the curriculum outline page.
  • Mapped source lesson links from the root curriculum page.
  • Used deterministic slugs for folders and lesson files.
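Deterministic slugs mean the same title and position always produce the same folder or file name, so re-runs do not create duplicates. A minimal sketch, assuming a two-digit order prefix like the module folders listed later in this log; `slugify` and `module_folder` are hypothetical names, and the example titles are reconstructed from the folder names rather than quoted from the outline.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Lowercase ASCII slug: runs of other characters collapse to one hyphen."""
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")
    return slug or "untitled"


def module_folder(order: int, title: str) -> str:
    """Prefix the slug with a zero-padded position from the outline."""
    return f"{order:02d}-{slugify(title)}"


print(module_folder(1, "Introduction and Environment Setup"))
# 01-introduction-and-environment-setup
```

Because the order comes from the curriculum outline and not from filename sorting, the numeric prefix stays stable even if Notion export names change.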

5.3 Target rebuild

  • Replaced old placeholder root file:
    • deleted wordpress/generatepress/index.md
    • created wordpress/generatepress/index.mdx
  • Rebuilt full module tree under wordpress/generatepress/.
  • Added _category_.json in every module with "link": { "type": "doc", "id": "index" }.
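The per-module `_category_.json` files can be generated in the same pass that creates the folders. The sketch below assumes each module also gets a `label` and `position` (standard Docusaurus category fields, but not confirmed by this log); only the `link` value is taken from the text above, and `write_category` is a hypothetical helper name.

```python
import json
from pathlib import Path


def write_category(module_dir: Path, label: str, position: int) -> Path:
    """Write a _category_.json that points the category at its index doc."""
    meta = {
        "label": label,        # assumption: sidebar label per module
        "position": position,  # assumption: sidebar order per module
        "link": {"type": "doc", "id": "index"},  # value stated in this log
    }
    path = module_dir / "_category_.json"
    path.write_text(json.dumps(meta, indent=2) + "\n", encoding="utf-8")
    return path
```

Linking the category to a doc makes the module's index page open when the sidebar category is clicked, instead of an auto-generated listing.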

5.4 Cleanup and safety passes

  • Removed Notion artifacts (<aside>, Home icon/nav, duplicated curriculum block).
  • Normalized text to ASCII-safe output where possible.
  • Escaped problematic < outside fenced code blocks for MDX safety.
  • Added explicit placeholders where lesson source was missing or empty.
  • Copied image assets for the how-to page.
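The MDX safety pass listed above can be sketched as a line-by-line filter that tracks whether it is inside a fenced code block. This is a simplified illustration under stated assumptions, not the migration script itself: it only recognizes triple-backtick fences, does not protect inline code spans, and would also escape intentional JSX, so it suits plain imported prose only. The name `escape_lt_outside_fences` is hypothetical.

```python
def escape_lt_outside_fences(text: str) -> str:
    """Replace '<' with '&lt;' on every line that is not inside a ``` fence."""
    out = []
    in_fence = False
    for line in text.split("\n"):
        if line.lstrip().startswith("```"):
            # Fence delimiter: toggle state and keep the line unchanged.
            in_fence = not in_fence
            out.append(line)
        elif in_fence:
            out.append(line)
        else:
            out.append(line.replace("<", "&lt;"))
    return "\n".join(out)


sample = "a < b\n```\nif x < y:\n```\n<aside>"
print(escape_lt_outside_fences(sample))
```

In the sample, the comparison in the first line and the stray `<aside>` are escaped, while the `<` inside the fenced block is left as written.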

6. Issues Faced and Solutions

| Issue | Symptom | Root Cause | Solution Applied |
| --- | --- | --- | --- |
| Notion wrapper noise | Docs contained duplicate nav blocks and wrappers | Notion export embeds global nav and <aside> markup | Implemented cleanup pass removing wrappers, Home icon block, repeated curriculum links |
| Sparse source coverage | Many lessons had no body content | Export contained many empty pages | Generated explicit placeholders and marked them in module indexes |
| Ambiguous curriculum links | Some links include parentheses and unusual titles | Simple regex can truncate URL/title parsing | Used robust parser with first ]( and trailing ) strategy |
| MDX parser risk | Raw < can break build | Content included comparisons and HTML-like text | Escaped < outside code fences (&lt;) |
| Leading separator noise | Some lesson files started with stray --- after frontmatter | Notion cleanup left orphan separators | Ran a post-pass to strip leading blank/separator noise |
| Nested archive packaging | First unzip did not expose final markdown pages | Notion export zip contained another zip | Performed second unzip on nested ExportBlock archive |
| Existing dirty worktree | Unrelated files were modified in repo | Prior changes already present | Staged and committed only wordpress/generatepress/* |
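The "first ]( and trailing )" strategy for ambiguous curriculum links can be illustrated as follows. This is a sketch assuming one Markdown link per line; `parse_md_link` is a hypothetical name and the example file name is made up for illustration.

```python
def parse_md_link(line: str):
    """Split a Markdown link at the first '](' and the last ')'.

    Returns (title, url), or None if the line holds no link.
    Parentheses inside the title or URL survive because only the
    outermost delimiters are used.
    """
    start = line.find("[")
    mid = line.find("](", start + 1)
    end = line.rfind(")")
    if start == -1 or mid == -1 or end < mid + 2:
        return None
    return line[start + 1:mid], line[mid + 2:end]


line = "- [Hooks (part 1)](Generatepress/Hooks%20(part%201)%20abc123.md)"
print(parse_md_link(line))
# ('Hooks (part 1)', 'Generatepress/Hooks%20(part%201)%20abc123.md')
```

A regex such as `\[(.*?)\]\((.*?)\)` would stop at the first `)` and truncate this URL, which is the failure mode described in the table above.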

7. Scripts and Commands Used

7.1 One-off inline Python scripts used

  1. Source classifier script
    • Counted real-content vs empty-wrapper pages by checking meaningful text after the second <aside>.

  2. Curriculum parser and migration builder
    • Parsed outline modules and lessons.
    • Parsed root mapping links.
    • Rebuilt target folder tree.
    • Generated frontmatter and lesson/index docs.
    • Generated placeholders for missing content.
    • Copied assets for the how-to page.

  3. Post-clean script
    • Removed leading blank + separator noise at top of lesson bodies.

Representative command pattern used:

python3 - <<'PY'
from pathlib import Path

root = Path('/opt/docker-data/apps/docusaurus/site/docs/wordpress/generatepress')
# Visit lesson files in numbered module folders; module index pages are skipped.
for p in root.glob('[0-9][0-9]-*/*.mdx'):
    if p.name == 'index.mdx':
        continue
    text = p.read_text(encoding='utf-8', errors='ignore')
    parts = text.split('\n')
    # Line 0 is assumed to open the frontmatter; find the '---' that closes it.
    end = next((i for i in range(1, len(parts)) if parts[i].strip() == '---'), None)
    if end is None:
        continue
    fm = parts[:end + 1]
    body = parts[end + 1:]
    # Drop blank lines and orphan '---' separators at the top of the body.
    while body and body[0].strip() in ('', '---'):
        body = body[1:]
    p.write_text('\n'.join(fm) + '\n\n' + '\n'.join(body).rstrip() + '\n', encoding='utf-8')
PY

7.2 Validation commands used

# Check for leftover Notion wrappers
rg -n "<aside>|</aside>|notion.so/icons/home_" \
"/opt/docker-data/apps/docusaurus/site/docs/wordpress/generatepress"

# Docusaurus build verification
sudo docker exec docusaurus sh -lc "cd /app && npx docusaurus build --no-minify"

8. Final Output Metrics

| Metric | Value |
| --- | --- |
| Modules scaffolded | 11 |
| Lesson docs generated | 69 |
| Lessons with migrated source content | 8 |
| Placeholder lessons | 61 |
| Module index.mdx files | 11 |
| Module _category_.json files | 11 |
| Root and reference pages | 3 |
| Total .mdx files in target | 83 |
| Copied image assets (.png) | 3 |
| Total files in target tree (.mdx + .json + .png) | 97 |

Placeholder distribution by module

| Module Folder | Lessons | Content | Placeholders |
| --- | --- | --- | --- |
| 01-introduction-and-environment-setup | 4 | 4 | 0 |
| 02-generatepress-core-structure | 5 | 2 | 3 |
| 03-customization-fundamentals | 5 | 0 | 5 |
| 04-elements-module | 5 | 0 | 5 |
| 05-generateblocks-integration | 5 | 0 | 5 |
| 06-site-library-and-templates | 5 | 0 | 5 |
| 07-child-theme | 20 | 2 | 18 |
| 08-custom-css-php-and-hooks | 5 | 0 | 5 |
| 09-seo-schema-and-accessibility | 5 | 0 | 5 |
| 10-advanced-page-design | 5 | 0 | 5 |
| 11-advanced-developer-topics | 5 | 0 | 5 |
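The per-module figures can be cross-checked against the totals in the final output metrics. The sketch below simply re-enters the rows from the table above and sums them; it is a consistency check written for this log, not part of the original migration tooling.

```python
# (module folder, lessons, content, placeholders), copied from the table above
rows = [
    ("01-introduction-and-environment-setup", 4, 4, 0),
    ("02-generatepress-core-structure", 5, 2, 3),
    ("03-customization-fundamentals", 5, 0, 5),
    ("04-elements-module", 5, 0, 5),
    ("05-generateblocks-integration", 5, 0, 5),
    ("06-site-library-and-templates", 5, 0, 5),
    ("07-child-theme", 20, 2, 18),
    ("08-custom-css-php-and-hooks", 5, 0, 5),
    ("09-seo-schema-and-accessibility", 5, 0, 5),
    ("10-advanced-page-design", 5, 0, 5),
    ("11-advanced-developer-topics", 5, 0, 5),
]

# Every lesson is either migrated content or a placeholder.
for folder, lessons, content, placeholders in rows:
    assert lessons == content + placeholders, folder

totals = tuple(sum(r[i] for r in rows) for i in (1, 2, 3))
print(totals)  # (69, 8, 61)
```

The sums match the reported 69 lesson docs, 8 lessons with migrated content, and 61 placeholders; adding the 11 module index files and 3 root and reference pages gives the 83 .mdx files listed in the metrics.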

9. Validation Results

  • Notion wrapper artifact scan passed for target migration tree.
  • Docusaurus production build completed successfully.
  • Pre-existing warnings unrelated to the migration remained (deprecated config, blog author warning); they are outside the scope of this migration.

10. Git Delivery

| Item | Value |
| --- | --- |
| Commit hash | 9027041 |
| Commit message | docs: scaffold GeneratePress curriculum from Notion export |
| Branch | main |
| Remote push | origin/main pushed successfully |

Scope control during commit:

  • Only wordpress/generatepress/* was staged and committed.
  • Unrelated modified files (for WP Performance docs) were intentionally left uncommitted.

11. Re-run Checklist for Future Agents

  1. Extract all nested Notion archives before content parsing.
  2. Parse curriculum outline first; do not rely on filename sorting.
  3. Parse root mapping links with a robust delimiter strategy.
  4. Keep placeholder policy explicit and user-approved.
  5. Run MDX safety pass for < outside code fences.
  6. Run Docusaurus build in container before final handoff.
  7. If the worktree is dirty, stage and commit only the migration subtree.